add persistentStorage as Storage first class#1394
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds ChangesNodePersistentStorage full storage backend
Sequence Diagram(s)sequenceDiagram
participant ComputeHandler as startCompute / initialize
participant resolveComputeFileObject
participant PersistentStorageFactory as PersistentStorageFactory (LocalFS)
participant NodePersistentStorage
participant DockerC2D as compute_engine_docker
ComputeHandler->>resolveComputeFileObject: resolve(algorithm or dataset)
resolveComputeFileObject-->>ComputeHandler: BaseFileObject (or null)
ComputeHandler->>ComputeHandler: ensureConsumerAllowedForPersistentStorageLocalfsFileObject
ComputeHandler->>DockerC2D: start job
DockerC2D->>resolveComputeFileObject: resolve asset fileObject
resolveComputeFileObject-->>DockerC2D: BaseFileObject
alt type === nodePersistentStorage
DockerC2D->>PersistentStorageFactory: getReadableStream(bucketId, fileName, consumerAddress)
PersistentStorageFactory-->>DockerC2D: ReadableStream → bind mount
else other type
DockerC2D->>NodePersistentStorage: getStorageClass(file, config, consumerAddress)
NodePersistentStorage-->>DockerC2D: StorageReadable → download
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 9
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/components/core/compute/utils.ts (1)
51-79:⚠️ Potential issue | 🟠 Major | 🏗️ Heavy liftKeep persistent-storage ACL errors structured.
Line 51 throws for a missing
consumerAddress, but the catch on Lines 76-79 converts that and any persistent-storage ACL/config failure fromfetchSpecificFileMetadata()into empty checksums. The compute handlers then report a generic checksum failure instead of preserving the 400/403 authorization semantics. Let persistent-storage authorization/config errors propagate as typed/structured errors, and only fall back to empty checksums for ordinary checksum lookup misses.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/components/core/compute/utils.ts` around lines 51 - 79, The current catch block indiscriminately converts all errors into empty checksums, but persistent-storage authorization/config errors should propagate as structured errors to preserve HTTP status semantics. Modify the error handling to distinguish between persistent-storage related errors (the error thrown for missing consumerAddress when file type is nodePersistentStorage, and authorization errors from the fetchSpecificFileMetadata call) and other errors. Let persistent-storage errors re-throw as typed/structured errors so handlers can report proper HTTP status codes, while only catching and returning empty checksums for ordinary checksum lookup failures that don't involve storage authorization or configuration issues.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/components/c2d/compute_engine_docker.ts`:
- Around line 1971-1974: Remove the line that assigns the decrypted resolved
value back to asset.fileObject within the persistent-storage handling code.
Instead of persisting the plaintext decrypted data (bucketId and fileName) to
the job record through asset.fileObject, keep the resolved variable local to the
current scope. The subsequent line that declares fo as a type-cast of resolved
should remain, but ensure resolved is only used locally for the current
operation, allowing uploadData to independently re-detect encrypted and
DDO-derived persistent-storage assets without storing plaintext credentials in
the persisted job data.
In `@src/components/core/compute/initialize.ts`:
- Around line 225-229: The
ensureConsumerAllowedForPersistentStorageLocalfsFileObject helper in the loop
iterating over task.algorithm and task.datasets only validates plaintext
nodePersistentStorage file objects, causing encrypted file objects and
documentId/serviceId persistent-storage references to skip ACL validation at
initialize time. Before calling
ensureConsumerAllowedForPersistentStorageLocalfsFileObject, resolve the
effective file object for each elem by applying the same decrypt and DDO
resolution step that the Docker path uses later, then pass the resolved file
object to the ACL helper instead of the raw elem.fileObject so that denied
consumers fail with the expected 403 error during initialization.
In `@src/components/core/compute/startCompute.ts`:
- Around line 234-238: The ACL preflight validation in the loop iterating
through task.algorithm and task.datasets only checks raw elem.fileObject, which
bypasses encrypted file objects and persistent-storage references using
documentId or serviceId. Resolve and decrypt file objects first before passing
them to ensureConsumerAllowedForPersistentStorageLocalfsFileObject, or extend
that helper function to handle encrypted and DDO-resolved persistent-storage
inputs directly. Apply this fix to both loop occurrences (the one checking
task.algorithm and task.datasets, and the second loop at lines 782-786) to
ensure all compute file-object representations are validated before jobs are
accepted or funds are locked.
In `@src/components/core/handler/ddoHandler.ts`:
- Around line 779-780: Remove the two console.log debug statements at lines
779-780 in the ddoHandler.ts file that log the validation object directly and as
stringified JSON. These calls bypass the module logger and pollute stdout.
Replace them with a single call to CORE_LOGGER (imported from
src/utils/logging/common.ts) at an appropriate log level (such as debug) with a
meaningful message and the validation data, or remove them entirely if they are
no longer needed for debugging purposes.
In `@src/components/core/handler/downloadHandler.ts`:
- Around line 68-72: The type comparison in the conditional block starting with
`(task.fileObject as { type?: string })?.type ===
FileObjectType.NODE_PERSISTENT_STORAGE` is case-sensitive, but
Storage.getStorageClass accepts storage types case-insensitively. To prevent
case variations from bypassing the consumerAddress guard, normalize the case of
the storage type string before comparing it to
FileObjectType.NODE_PERSISTENT_STORAGE. Convert the type property to a
consistent case (such as lowercase or uppercase) before the equality check to
ensure the 403 guard is enforced regardless of the input casing.
In `@src/components/core/handler/fileInfoHandler.ts`:
- Line 30: The file type check at line 30 in the fileInfoHandler function only
validates the embedded file.type property directly without also verifying it
against the normalized command.type, which allows requests to bypass
restrictions by embedding FILE_OBJECT_TYPE.NODE_PERSISTENT_STORAGE while setting
an allowed top-level type. To fix this, ensure the persistent storage access
check validates both the command type (normalized to handle casing variants) and
the embedded file type together, similar to how formatMetadata handles
normalization. Apply the same validation logic to the additional check at lines
77-82 to prevent casing variants from bypassing the storage routing gates.
In `@src/components/persistentStorage/PersistentStorageFactory.ts`:
- Around line 200-218: The consumerAddress parameter in both getFileInfo and
getReadableStream abstract methods is currently optional, which allows callers
to bypass ACL enforcement by simply omitting this parameter. Make
consumerAddress a required parameter by removing the optional marker (?) from
its definition in both method signatures, or alternatively split these methods
into separate implementations where one provides explicit trusted internal
access without ACL checks and another requires consumerAddress as a mandatory
parameter for ACL-enforced access.
In `@src/test/integration/persistentStorage.test.ts`:
- Around line 496-505: The try-catch block in the
intruderStorage.getReadableStream() test catches all exceptions and treats any
error as an access denial, which can mask unexpected errors. Instead of a bare
catch block, catch the exception and verify it is specifically an access denial
error (check the error type or message for access control related details). Only
set denied to true if the caught error is the expected access denial error, and
fail or re-throw if it is an unexpected error type.
- Around line 404-412: The try-catch block in the test is catching all
exceptions without validating the specific error type or message, which could
mask unrelated failures as false positives. Instead of setting denied to true
for any caught exception, verify that the specific error thrown by
getFileChecksum indicates an access denial error by checking the error message
or error type/code (such as checking if the error contains specific text like
"access denied" or "forbidden" or if it matches a specific error code). Use
expect assertions to validate the error details rather than just relying on the
catch block to confirm the denial occurred.
---
Outside diff comments:
In `@src/components/core/compute/utils.ts`:
- Around line 51-79: The current catch block indiscriminately converts all
errors into empty checksums, but persistent-storage authorization/config errors
should propagate as structured errors to preserve HTTP status semantics. Modify
the error handling to distinguish between persistent-storage related errors (the
error thrown for missing consumerAddress when file type is
nodePersistentStorage, and authorization errors from the
fetchSpecificFileMetadata call) and other errors. Let persistent-storage errors
re-throw as typed/structured errors so handlers can report proper HTTP status
codes, while only catching and returning empty checksums for ordinary checksum
lookup failures that don't involve storage authorization or configuration
issues.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: d0ac2928-ca9e-466d-aa57-88dc5eb423c9
📒 Files selected for processing (18)
src/@types/PersistentStorage.tssrc/@types/fileObject.tssrc/components/c2d/compute_engine_docker.tssrc/components/core/compute/initialize.tssrc/components/core/compute/startCompute.tssrc/components/core/compute/utils.tssrc/components/core/handler/ddoHandler.tssrc/components/core/handler/downloadHandler.tssrc/components/core/handler/fileInfoHandler.tssrc/components/persistentStorage/PersistentStorageFactory.tssrc/components/persistentStorage/PersistentStorageLocalFS.tssrc/components/persistentStorage/PersistentStorageS3.tssrc/components/storage/NodePersistentStorage.tssrc/components/storage/Storage.tssrc/components/storage/getStorageClass.tssrc/components/storage/index.tssrc/test/integration/compute.test.tssrc/test/integration/persistentStorage.test.ts
| /** | ||
| * Stat-like metadata for a bucket file. ACL is enforced only when | ||
| * `consumerAddress` is provided (mirrors `getDockerMountObject`). | ||
| */ | ||
| public abstract getFileInfo( | ||
| bucketId: string, | ||
| fileName: string, | ||
| consumerAddress?: string | ||
| ): Promise<{ size: number; lastModified: number }> | ||
|
|
||
| /** | ||
| * Returns a readable stream of a bucket file's contents. ACL is enforced only | ||
| * when `consumerAddress` is provided. Backs the NodePersistentStorage class. | ||
| */ | ||
| public abstract getReadableStream( | ||
| bucketId: string, | ||
| fileName: string, | ||
| consumerAddress?: string | ||
| ): Promise<NodeJS.ReadableStream> |
There was a problem hiding this comment.
Require consumerAddress for file metadata and stream access.
The new contract documents that ACL checks are skipped when consumerAddress is omitted. Since NodePersistentStorage and Storage.getStorageClass() also carry this value as optional, a caller can stat or stream a persistent-storage object without an ACL context. Make consumerAddress required here, or split trusted internal no-ACL access into a separate explicit API.
🔒 Proposed contract tightening
/**
- * Stat-like metadata for a bucket file. ACL is enforced only when
- * `consumerAddress` is provided (mirrors `getDockerMountObject`).
+ * Stat-like metadata for a bucket file. ACL must be enforced for the
+ * provided consumer address.
*/
public abstract getFileInfo(
bucketId: string,
fileName: string,
- consumerAddress?: string
+ consumerAddress: string
): Promise<{ size: number; lastModified: number }>
/**
- * Returns a readable stream of a bucket file's contents. ACL is enforced only
- * when `consumerAddress` is provided. Backs the NodePersistentStorage class.
+ * Returns a readable stream of a bucket file's contents. ACL must be enforced
+ * for the provided consumer address. Backs the NodePersistentStorage class.
*/
public abstract getReadableStream(
bucketId: string,
fileName: string,
- consumerAddress?: string
+ consumerAddress: string
): Promise<NodeJS.ReadableStream>📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| /** | |
| * Stat-like metadata for a bucket file. ACL is enforced only when | |
| * `consumerAddress` is provided (mirrors `getDockerMountObject`). | |
| */ | |
| public abstract getFileInfo( | |
| bucketId: string, | |
| fileName: string, | |
| consumerAddress?: string | |
| ): Promise<{ size: number; lastModified: number }> | |
| /** | |
| * Returns a readable stream of a bucket file's contents. ACL is enforced only | |
| * when `consumerAddress` is provided. Backs the NodePersistentStorage class. | |
| */ | |
| public abstract getReadableStream( | |
| bucketId: string, | |
| fileName: string, | |
| consumerAddress?: string | |
| ): Promise<NodeJS.ReadableStream> | |
| /** | |
| * Stat-like metadata for a bucket file. ACL must be enforced for the | |
| * provided consumer address. | |
| */ | |
| public abstract getFileInfo( | |
| bucketId: string, | |
| fileName: string, | |
| consumerAddress: string | |
| ): Promise<{ size: number; lastModified: number }> | |
| /** | |
| * Returns a readable stream of a bucket file's contents. ACL must be enforced | |
| * for the provided consumer address. Backs the NodePersistentStorage class. | |
| */ | |
| public abstract getReadableStream( | |
| bucketId: string, | |
| fileName: string, | |
| consumerAddress: string | |
| ): Promise<NodeJS.ReadableStream> |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/components/persistentStorage/PersistentStorageFactory.ts` around lines
200 - 218, The consumerAddress parameter in both getFileInfo and
getReadableStream abstract methods is currently optional, which allows callers
to bypass ACL enforcement by simply omitting this parameter. Make
consumerAddress a required parameter by removing the optional marker (?) from
its definition in both method signatures, or alternatively split these methods
into separate implementations where one provides explicit trusted internal
access without ACL checks and another requires consumerAddress as a mandatory
parameter for ACL-enforced access.
|
@coderabbitai full review |
✅ Action performedFull review finished. |
|
/run-security-scan |
alexcos20
left a comment
There was a problem hiding this comment.
AI automated code review (Gemini 3).
Overall risk: high
Summary:
The PR successfully integrates nodePersistentStorage into the main Storage abstraction and cleanly expands its functionality to compute algorithms. However, a critical authorization bypass vulnerability exists due to inconsistent string case validation between the ACL pre-checks and the underlying storage factory. Adding strict case-insensitive checks and defense-in-depth ACL enforcement in the new storage class is required before merging.
Comments:
• [ERROR][security] Critical Security Vulnerability (Authorization Bypass): Storage.getStorageClass() evaluates file.type case-insensitively, but the pre-checks here and throughout the file evaluate type case-sensitively.
If an attacker submits a compute job with a dataset or algorithm type: "NODEpersistentStorage", it will fail this !== 'nodePersistentStorage' check. This circumvents the bind-mount skips and ACL validations. However, it will successfully be instantiated by getStorageClass() later in the file and be fully downloaded into the container without ACL enforcement (since consumerAddress is not passed to getStorageClass in the download path).
Fix: Use a case-insensitive comparison for ALL file.type checks here and on lines 2938, 2948, and 2989.
- if (!resolved || resolved.type !== 'nodePersistentStorage') {
+ if (!resolved || resolved.type?.toLowerCase() !== 'nodepersistentstorage') {• [ERROR][security] This guard is case-sensitive. An attacker or client supplying type: "NODEpersistentStorage" would bypass this throw, but getStorageClass would still instantiate the class without a consumerAddress. This leads to a silent logic failure where the checksum evaluates to undefined instead of throwing the required error.
- if ((file as any)?.type === 'nodePersistentStorage' && !consumerAddress) {
+ if ((file as any)?.type?.toLowerCase() === 'nodepersistentstorage' && !consumerAddress) {• [WARNING][security] As a defense-in-depth measure, it is highly recommended to strictly enforce the presence of consumerAddress when attempting to read or stat persistent storage files. Because Storage.getStorageClass makes consumerAddress optional, any internal code that accidentally calls getReadableStream() on this class without it will silently bypass ACL enforcement entirely.
override async getReadableStream(): Promise<StorageReadable> {
+ if (!this.consumerAddress) {
+ throw new Error('Access Denied: consumerAddress is strictly required to read persistent storage files')
+ }
const { bucketId, fileName } = this.getFile() as PersistentStorageObjectConsider adding the same strict check to fetchSpecificFileMetadata.
• [INFO][style] Excellent defensive programming using a case-insensitive check here! To permanently resolve the security inconsistencies mentioned in compute_engine_docker.ts and utils.ts, please export this isPersistentStorageType helper (e.g., to src/@types/fileObject.ts or a shared utility) and use it universally across the codebase (including in PersistentStorageFactory.ts ACL checks).
• [WARNING][security] The user-supplied fileName is joined directly using path.join. While ensureFileExists may perform validation, it is always a best practice to explicitly sanitize filenames to prevent Path Traversal attacks (e.g., passing fileName: "../../etc/passwd").
- const targetPath = path.join(this.bucketPath(bucketId), fileName)
+ const targetPath = path.join(this.bucketPath(bucketId), path.basename(fileName))• [INFO][architecture] The resolveComputeFileObject abstraction is an excellent design improvement. It cleans up the file-resolution pipeline and prevents code duplication effectively.
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/components/c2d/compute_engine_docker.ts`:
- Around line 1965-1969: The type check at line 1967 comparing resolved.type
against the literal string 'nodePersistentStorage' does not normalize the type
value before comparison, which means persistent-storage files with different
casing variants will not match and will incorrectly fall into the normal asset
provisioning path instead of being bind mounted. Normalize the resolved.type
value by converting it to lowercase (or a consistent case) before comparing it
against a normalized persistent storage type string. Apply this same
normalization fix to all other persistent-storage type checks in the file,
including the skip branches near line 1967 and the additional occurrences
mentioned in the "Also applies to" comment at lines 2939-2950 and 2988-2990.
In `@src/components/core/compute/utils.ts`:
- Around line 49-55: The type check in the guard condition is case-sensitive,
comparing against 'nodePersistentStorage', but storage resolution is
case-insensitive. This means mixed-case type values like 'NodePersistentStorage'
can bypass the guard and skip the intended error. Normalize the type value to
lowercase before performing the comparison in the conditional check to ensure
all case variations of persistent storage types are properly caught and throw
the required error for missing consumerAddress.
In `@src/components/core/handler/fileInfoHandler.ts`:
- Around line 37-47: The current implementation returns valid placeholder
metadata for persistent-storage files in the isPersistentStorageType check,
which still exposes these files through fileInfo when resolved via the
did/serviceId path. Remove this placeholder return statement and instead modify
the did/serviceId resolution branch to return a 403 Forbidden status code when
formatMetadata(...) returns null (indicating persistent storage), preventing the
placeholder metadata from being added to the response at all.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: bd82cdc1-11da-47b7-92df-10a5c436a44d
📒 Files selected for processing (19)
src/@types/PersistentStorage.tssrc/@types/fileObject.tssrc/components/c2d/compute_engine_docker.tssrc/components/core/compute/initialize.tssrc/components/core/compute/startCompute.tssrc/components/core/compute/utils.tssrc/components/core/handler/ddoHandler.tssrc/components/core/handler/downloadHandler.tssrc/components/core/handler/fileInfoHandler.tssrc/components/persistentStorage/PersistentStorageFactory.tssrc/components/persistentStorage/PersistentStorageLocalFS.tssrc/components/persistentStorage/PersistentStorageS3.tssrc/components/storage/NodePersistentStorage.tssrc/components/storage/Storage.tssrc/components/storage/getStorageClass.tssrc/components/storage/index.tssrc/test/integration/compute.test.tssrc/test/integration/persistentStorage.test.tssrc/test/unit/download.test.ts
💤 Files with no reviewable changes (1)
- src/components/core/handler/ddoHandler.ts
|
/run-security-scan |
|
@coderabbitai full review |
✅ Action performedFull review finished. |
alexcos20
left a comment
There was a problem hiding this comment.
AI automated code review (Gemini 3).
Overall risk: high
Summary:
The PR successfully elevates persistent storage to a first-class storage type, unifying its access through the Storage class and improving ACL resolutions for compute assets. However, a critical security vulnerability exists due to a combination of a case-sensitivity bypass in downloadHandler.ts and a 'fail-open' ACL check in PersistentStorageLocalFS.ts, which together allow unauthenticated downloading of any persistent storage file.
Comments:
• [ERROR][security] The check for persistent storage uses strict equality against FileObjectType.NODE_PERSISTENT_STORAGE. If an attacker submits a download command with type: 'NodePersistentStorage' (capitalized), this check will evaluate to false and the file will be passed to Storage.getStorageClass(). Since getStorageClass() performs a case-insensitive match, it will route the request to NodePersistentStorage without a consumerAddress, effectively bypassing this block and allowing unauthorized direct downloads. Use the new helper function to ensure case-insensitivity.
- if (
- (task.fileObject as { type?: string })?.type ===
- FileObjectType.NODE_PERSISTENT_STORAGE
- ) {
+ if (isPersistentStorageType((task.fileObject as { type?: string })?.type)) {• [ERROR][security] This implements a 'fail-open' mechanism for file metadata. If consumerAddress is omitted or undefined, the ACL check is entirely skipped and file metadata is returned! This allows internal functions that inadvertently omit consumerAddress to expose file information. To ensure a secure 'fail-closed' posture, this method must throw an error if consumerAddress is missing.
- if (consumerAddress) {
- await this.assertConsumerAllowedForBucket(consumerAddress, bucketId)
- }
+ if (!consumerAddress) {
+ throw new Error('Access denied: consumerAddress is required to access persistent storage')
+ }
+ await this.assertConsumerAllowedForBucket(consumerAddress, bucketId)• [ERROR][security] Similarly, getReadableStream fails open when consumerAddress is undefined, skipping the ACL check. This is exactly what makes the bypass in downloadHandler.ts so critical, as NodePersistentStorage passes undefined when instantiated outside a compute job. Enforce the ACL strictly by requiring the consumer address.
- if (consumerAddress) {
- await this.assertConsumerAllowedForBucket(consumerAddress, bucketId)
- }
+ if (!consumerAddress) {
+ throw new Error('Access denied: consumerAddress is required to access persistent storage')
+ }
+ await this.assertConsumerAllowedForBucket(consumerAddress, bucketId)• [WARNING][security] The /api/services/fileinfo route is inherently unauthenticated, but it now accepts consumerAddress from the request body to fulfill persistent storage queries. Since the backend strictly trusts this consumerAddress for ACL verification without a cryptographic signature (unlike Compute commands), any unauthenticated user can provide the address of a known bucket owner and successfully enumerate metadata (size, last modified time) of arbitrary files within that bucket. Consider whether metadata leakage is acceptable here, or require a signature validation.
• [INFO][other] Excellent defensive programming. Guarding against a missing consumerAddress here prevents unexpected downstream failures and clearly communicates the context requirement for persistent storage.
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
src/components/core/compute/startCompute.ts (1)
211-217:⚠️ Potential issue | 🟠 Major | ⚡ Quick winAuthorize persistent-storage inputs before checksum retrieval in paid start flow.
At Line 211,
getAlgoChecksums(...)runs before the ACL preflight loop at Line 235. For persistent-storage algorithms, unauthorized consumers can hit checksum failure first and return a generic 500 path instead of the explicit ACL denial response.Reorder so the
[task.algorithm, ...task.datasets]ACL preflight executes before checksum calculation.Also applies to: 235-248
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/components/core/compute/startCompute.ts` around lines 211 - 217, The getAlgoChecksums function is currently called before the ACL preflight validation loop, which causes unauthorized consumers to receive a generic 500 error instead of an explicit ACL denial response when dealing with persistent-storage algorithms. Reorder the code in the startCompute flow so that the ACL preflight check validating access to task.algorithm and task.datasets executes before the getAlgoChecksums call, ensuring authorization is confirmed before attempting checksum retrieval.src/components/core/compute/initialize.ts (1)
106-112:⚠️ Potential issue | 🟠 Major | ⚡ Quick winRun persistent-storage ACL preflight before computing algorithm checksums.
At Line 106,
getAlgoChecksums(...)runs before the ACL preflight added at Line 228. For persistent-storage algorithms, an unauthorized consumer can fail in checksum retrieval first and return the generic checksum failure path, instead of the explicit ACL denial response fromensureConsumerAllowedForPersistentStorageLocalfsFileObject(...).Move the ACL preflight loop above
getAlgoChecksums(...)so authorization failures consistently return the intended denial response before checksum work starts.Suggested reorder
- const algoChecksums = await getAlgoChecksums( - task.algorithm.documentId, - task.algorithm.serviceId, - node, - config, - task.consumerAddress - ) - - const isRawCodeAlgorithm = task.algorithm.meta?.rawcode - const hasValidChecksums = algoChecksums.container && algoChecksums.files - - if (!isRawCodeAlgorithm && !hasValidChecksums) { - ... - } for (const elem of [task.algorithm, ...task.datasets]) { const resolvedFileObject = (await resolveComputeFileObject(elem)) ?? elem.fileObject const psAccess = await ensureConsumerAllowedForPersistentStorageLocalfsFileObject( node, task.consumerAddress, resolvedFileObject ) if (psAccess) { return psAccess } } + const algoChecksums = await getAlgoChecksums( + task.algorithm.documentId, + task.algorithm.serviceId, + node, + config, + task.consumerAddress + ) + const isRawCodeAlgorithm = task.algorithm.meta?.rawcode + const hasValidChecksums = algoChecksums.container && algoChecksums.files + if (!isRawCodeAlgorithm && !hasValidChecksums) { + ... + }Also applies to: 228-241
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/components/core/compute/initialize.ts` around lines 106 - 112, The ACL preflight validation that calls ensureConsumerAllowedForPersistentStorageLocalfsFileObject() currently executes at line 228, after the getAlgoChecksums() call at line 106. For persistent-storage algorithms, this allows unauthorized consumers to fail during checksum retrieval with a generic error instead of receiving the explicit ACL denial response. Move the entire ACL preflight loop that includes the ensureConsumerAllowedForPersistentStorageLocalfsFileObject() check to execute before the getAlgoChecksums() function call, ensuring authorization validation happens first and returns the intended denial response before any checksum computation work begins.src/components/httpRoutes/fileInfo.ts (1)
36-58:⚠️ Potential issue | 🟠 Major | 🏗️ Heavy liftDo not use a body-supplied address as the ACL principal.
consumerAddressnow controls persistent-storage metadata access, but this route reads it fromreq.body. Derive it from a verified identity such asreq.calleror require signed authentication before passing it toFileInfoHandler; otherwise callers can spoof an allowlisted address.🛡️ Minimal direction
- const consumerAddress = (req.body as { consumerAddress?: string })?.consumerAddress + const requestedConsumerAddress = (req.body as { consumerAddress?: string }) + ?.consumerAddress + const consumerAddress = req.caller + + if ( + requestedConsumerAddress && + consumerAddress && + requestedConsumerAddress.toLowerCase() !== consumerAddress.toLowerCase() + ) { + res.status(403).send('consumerAddress does not match authenticated caller') + return + }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/components/httpRoutes/fileInfo.ts` around lines 36 - 58, The consumerAddress variable is currently being read directly from req.body without verification, which allows callers to spoof an allowlisted address and gain unauthorized metadata access. Remove the line that extracts consumerAddress from req.body and instead derive it from the verified req.caller property. Update the fileInfoTask object construction (both the if and else branches) to use the verified consumerAddress derived from req.caller rather than the unverified body value before passing it to FileInfoHandler.
🧹 Nitpick comments (2)
src/components/persistentStorage/PersistentStorageLocalFS.ts (1)
263-277: 💤 Low valueMinor inconsistency:
lastModifiednot truncated to integer.
listFiles(line 142) usesMath.floor(st.mtimeMs)forlastModified, butgetFileInfo(line 276) returnsst.mtimeMsdirectly (a float). This inconsistency could cause subtle differences when comparing timestamps across methods.♻️ Proposed fix for consistency
const targetPath = path.join(this.bucketPath(bucketId), fileName) const st = await fsp.stat(targetPath) - return { size: st.size, lastModified: st.mtimeMs } + return { size: st.size, lastModified: Math.floor(st.mtimeMs) } }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/components/persistentStorage/PersistentStorageLocalFS.ts` around lines 263 - 277, The getFileInfo method returns st.mtimeMs directly as a float value for lastModified, but the listFiles method uses Math.floor(st.mtimeMs) to return an integer. To maintain consistency across both methods, wrap st.mtimeMs with Math.floor() in the return statement of getFileInfo so that lastModified is always returned as a truncated integer value, matching the behavior in listFiles.src/components/c2d/compute_engine_docker.ts (1)
3403-3405: ⚡ Quick winMove the generic resolver out of the Docker engine module.
resolveComputeFileObjectis shared by compute pre-checks as well as Docker provisioning, so exporting it fromcompute_engine_docker.tsmakes core compute code depend on the Docker engine implementation. Move it unchanged into a shared compute/c2d utility module and update callers to import from there.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/components/c2d/compute_engine_docker.ts` around lines 3403 - 3405, The resolveComputeFileObject function is currently exported from compute_engine_docker.ts but is used by multiple modules including compute pre-checks and Docker provisioning, creating an inappropriate dependency on the Docker engine module from core compute code. Move this function without any changes from compute_engine_docker.ts into a new or existing shared compute utility module, then update all import statements in files that currently import resolveComputeFileObject from compute_engine_docker.ts to instead import it from the shared utility module.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/test/unit/download.test.ts`:
- Around line 218-231: The test currently only checks exact-case
nodePersistentStorage, but the storage resolution is case-insensitive, allowing
mixed-case variants like NodePersistentStorage to bypass the denial. Add a new
test case in the test suite that verifies handleDownloadUrlCommand properly
denies downloads for mixed-case file object types (such as
NodePersistentStorage), and update the handleDownloadUrlCommand handler itself
to perform case-insensitive type comparison when checking for the persistent
storage type to ensure all case variants are caught by the denial logic.
---
Outside diff comments:
In `@src/components/core/compute/initialize.ts`:
- Around line 106-112: The ACL preflight validation that calls
ensureConsumerAllowedForPersistentStorageLocalfsFileObject() currently executes
at line 228, after the getAlgoChecksums() call at line 106. For
persistent-storage algorithms, this allows unauthorized consumers to fail during
checksum retrieval with a generic error instead of receiving the explicit ACL
denial response. Move the entire ACL preflight loop that includes the
ensureConsumerAllowedForPersistentStorageLocalfsFileObject() check to execute
before the getAlgoChecksums() function call, ensuring authorization validation
happens first and returns the intended denial response before any checksum
computation work begins.
In `@src/components/core/compute/startCompute.ts`:
- Around line 211-217: The getAlgoChecksums function is currently called before
the ACL preflight validation loop, which causes unauthorized consumers to
receive a generic 500 error instead of an explicit ACL denial response when
dealing with persistent-storage algorithms. Reorder the code in the startCompute
flow so that the ACL preflight check validating access to task.algorithm and
task.datasets executes before the getAlgoChecksums call, ensuring authorization
is confirmed before attempting checksum retrieval.
In `@src/components/httpRoutes/fileInfo.ts`:
- Around line 36-58: The consumerAddress variable is currently being read
directly from req.body without verification, which allows callers to spoof an
allowlisted address and gain unauthorized metadata access. Remove the line that
extracts consumerAddress from req.body and instead derive it from the verified
req.caller property. Update the fileInfoTask object construction (both the if
and else branches) to use the verified consumerAddress derived from req.caller
rather than the unverified body value before passing it to FileInfoHandler.
---
Nitpick comments:
In `@src/components/c2d/compute_engine_docker.ts`:
- Around line 3403-3405: The resolveComputeFileObject function is currently
exported from compute_engine_docker.ts but is used by multiple modules including
compute pre-checks and Docker provisioning, creating an inappropriate dependency
on the Docker engine module from core compute code. Move this function without
any changes from compute_engine_docker.ts into a new or existing shared compute
utility module, then update all import statements in files that currently import
resolveComputeFileObject from compute_engine_docker.ts to instead import it from
the shared utility module.
In `@src/components/persistentStorage/PersistentStorageLocalFS.ts`:
- Around line 263-277: The getFileInfo method returns st.mtimeMs directly as a
float value for lastModified, but the listFiles method uses
Math.floor(st.mtimeMs) to return an integer. To maintain consistency across both
methods, wrap st.mtimeMs with Math.floor() in the return statement of
getFileInfo so that lastModified is always returned as a truncated integer
value, matching the behavior in listFiles.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 5e8fae36-ed22-426b-9054-d5aa2a62505b
📒 Files selected for processing (21)
src/@types/PersistentStorage.tssrc/@types/commands.tssrc/@types/fileObject.tssrc/components/c2d/compute_engine_docker.tssrc/components/core/compute/initialize.tssrc/components/core/compute/startCompute.tssrc/components/core/compute/utils.tssrc/components/core/handler/ddoHandler.tssrc/components/core/handler/downloadHandler.tssrc/components/core/handler/fileInfoHandler.tssrc/components/httpRoutes/fileInfo.tssrc/components/persistentStorage/PersistentStorageFactory.tssrc/components/persistentStorage/PersistentStorageLocalFS.tssrc/components/persistentStorage/PersistentStorageS3.tssrc/components/storage/NodePersistentStorage.tssrc/components/storage/Storage.tssrc/components/storage/getStorageClass.tssrc/components/storage/index.tssrc/test/integration/compute.test.tssrc/test/integration/persistentStorage.test.tssrc/test/unit/download.test.ts
💤 Files with no reviewable changes (1)
- src/components/core/handler/ddoHandler.ts
There was a problem hiding this comment.
🧹 Nitpick comments (1)
src/components/persistentStorage/PersistentStorageLocalFS.ts (1)
257-260: 💤 Low valueMisleading comment and unnecessary conditional for required parameter.
The abstract interface declares
consumerAddress: string(required), matching this signature. However, the comment states checksum "can be obtained without consumerAddress" and the code uses a conditional check. Since the parameter is required, theifblock always executes for valid callers.Remove the conditional to align with the contract, or update the abstract signature if optional behavior is intended.
Suggested fix to align with required parameter contract
await this.ensureBucketExists(bucketId) - // file checksum can be obtained without consumerAddress, but if provided, it will be validated for access. - if (consumerAddress) { - await this.assertConsumerAllowedForBucket(consumerAddress, bucketId) - } + await this.assertConsumerAllowedForBucket(consumerAddress, bucketId) await this.ensureFileExists(bucketId, fileName)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/components/persistentStorage/PersistentStorageLocalFS.ts` around lines 257 - 260, The consumerAddress parameter is declared as required in the abstract interface, but the implementation includes an unnecessary if (consumerAddress) conditional check and a misleading comment suggesting the checksum can be obtained without it. Remove the conditional wrapper around the this.assertConsumerAllowedForBucket(consumerAddress, bucketId) call to make it execute unconditionally, and update or remove the misleading comment to accurately reflect that consumerAddress is a required parameter. This aligns the implementation with the required parameter contract defined in the abstract interface.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@src/components/persistentStorage/PersistentStorageLocalFS.ts`:
- Around line 257-260: The consumerAddress parameter is declared as required in
the abstract interface, but the implementation includes an unnecessary if
(consumerAddress) conditional check and a misleading comment suggesting the
checksum can be obtained without it. Remove the conditional wrapper around the
this.assertConsumerAllowedForBucket(consumerAddress, bucketId) call to make it
execute unconditionally, and update or remove the misleading comment to
accurately reflect that consumerAddress is a required parameter. This aligns the
implementation with the required parameter contract defined in the abstract
interface.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 0f6629ee-e160-4e19-9e3f-ea36f394d801
📒 Files selected for processing (4)
src/components/core/compute/startCompute.tssrc/components/core/handler/downloadHandler.tssrc/components/persistentStorage/PersistentStorageLocalFS.tssrc/test/unit/download.test.ts
🚧 Files skipped from review as they are similar to previous changes (3)
- src/test/unit/download.test.ts
- src/components/core/handler/downloadHandler.ts
- src/components/core/compute/startCompute.ts
Persistent storage in Compute (C2D)
Makes node persistent storage (localfs buckets) usable as datasets and algorithms in compute jobs, and turns
nodePersistentStorageinto a first-class storage type instead of a scattered special case.What changed
documentId+serviceId). Previously only a plaintext file object worked, and persistent-storage algorithms were rejected outright.nodePersistentStorageis now a first-class storage class (alongsideurl,ipfs,s3, …).Storage.getStorageClass()returns a working backend for it, so algorithm provisioning and checksum computation go through the same path as every other storage type — removing the special-case branches.persistentStorage.options.folderproduced a relative Docker bind-mount path and failed container creation (mount path must be absolute). The folder is now resolved to an absolute path.Technical notes
resolveComputeFileObjecthelper resolves the effective (decrypted/DDO-derived) file object, reused by the compute engine and the initialize/start pre-checks.getReadableStream/getFileInfo/getFileChecksum; the algorithm trusted-checksum gate uses a real content hash so trust matching still works.Tests
Added integration/unit coverage for: encrypted and published persistent-storage datasets, persistent-storage algorithms, mixed persistent/non-persistent datasets, ACL denial, the storage class itself, absolute bind-mount paths, and download denial.
Summary by CodeRabbit